Abstract

The condition number of a problem measures the sensitivity of the answer to small changes
in the input. We call the problem ill-posed if its condition number is infinite. It turns out
that for many problems of numerical analysis, there is a simple relationship between the
condition number of a problem and the shortest distance from that problem to an ill-posed
one: the shortest distance is proportional to the reciprocal of the condition number (or
bounded by the reciprocal of the condition number). This is true for matrix inversion,
computing eigenvalues and eigenvectors, finding zeros of polynomials, and pole assignment
in linear control systems. In this paper we explain this phenomenon by showing that in all
these cases, the condition number $\kappa$ satisfies a differential inequality $\kappa' \ge m\cdot\kappa^2$ along a curve
in the space of problems. This differential inequality can be explicitly integrated, yielding a
maximum distance along the curve before which $\kappa$ must become infinite. Similarly, in many
cases we show $\kappa$ satisfies another differential inequality $\kappa' \le M\cdot\kappa^2$ which yields a lower
bound on the distance to the nearest ill-posed problem in terms of the reciprocal of the
condition number. The attraction of this approach is that it uses local information (the
gradient of a condition number) to answer a global question: how far away is the nearest
ill-posed problem? In addition to deriving many of the best known bounds for matrix inversion,
eigendecompositions and polynomial zero finding, we derive new bounds on the distance to
the nearest polynomial with multiple zeros and a new perturbation result on pole assignment.


1.  Introduction

The condition number of a problem measures the sensitivity of the answer to small
changes  in  the  input.  We  call  the  problem  ill-posed  if  its  condition  number  is  infinite.  The 
ill-posed  problems  typically  form  a  lower  dimensional  surface  in  the  space  of  problems.  It 
turns  out  that  for  many  problems  of  numerical  analysis,  there  is  a  simple  relationship 
between  the  condition  number  of  a  problem  and  the  shortest  distance  from  that  problem  to 
the surface of ill-posed ones: the shortest distance is proportional to the reciprocal of the
condition number (or bounded by the reciprocal of the condition number). This is true for
matrix inversion, computing eigenvalues and eigenvectors, finding zeros of polynomials, and
pole assignment in linear control systems.

For example, in the case of matrix inversion, if $A$ is perturbed to $A + \delta A$, then to first
order the solution $A^{-1}$ becomes $A^{-1} + X$ where

$$\frac{\|X\|}{\|A^{-1}\|} \le \|A^{-1}\| \cdot \|\delta A\|$$

($\|\cdot\|$ is any operator norm). Thus the condition number of this problem may be taken as
$\|A^{-1}\|$. It is well known ([Kahan66]) that the shortest distance from $A$ to the surface of
singular (ill-posed) matrices is $1/\|A^{-1}\|$. Similar results for computing eigendecompositions
and zeros of polynomials are due to [Wilkinson72] and [Hough] and will be discussed further
below.

In this paper we explain this phenomenon and unify the techniques used to obtain these
results by showing that in all these cases, the condition number $\kappa$ satisfies a differential
inequality

$$\frac{d\kappa}{ds} \ge m \cdot \kappa^2$$

along a curve in the space of problems. Here $s$ is arc length along the curve, and $\kappa(0)$ is the
condition number of the initial problem. We will choose the curve in a direction that makes
$\kappa$ increase quickly. This differential inequality can be explicitly integrated, yielding a
maximum distance $1/(m\kappa(0))$ along the curve before which $\kappa$ must become infinite. Thus,
$1/(m\kappa(0))$ is an upper bound on the distance to the surface of ill-posed problems. Similarly,
in many cases we show $\kappa$ satisfies another differential inequality

$$\frac{d\kappa}{ds} \le M \cdot \kappa^2$$

which yields a lower bound $1/(M\kappa(0))$ on the distance to the nearest ill-posed problem.

The attraction of this approach is that it uses purely local information (the norm of the
gradient  of  the  condition  number)  to  answer  a  global  question:  how  far  away  is  the  nearest 
point  in  a  (generally  quite  complicated)  set  of  ill-posed  problems? 

The  rest  of  the  paper  is  organized  as  follows.  In  section  2  we  present  our  differential 
inequalities  and  solve  them.  Sections  3  through  6  cover  matrix  inversion, 
eigendecompositions, polynomial zero finding, and pole assignment. Sections 3 through 6 may
be read independently. The results on matrix inversion and some of the results on
eigendecompositions are known, but others are new. One of our upper bounds on the
distance from a polynomial to one with a double zero is known but another is new. Our
lower bound on this distance is new. Our results on pole assignment are also new.

2.  Differential Inequalities

The differential inequalities we need are given in the following lemmas. The first one will
be used to derive an upper bound on the distance to the nearest ill-posed problem in terms of
the condition number.


Lemma 1: Suppose $m>0$, $y_0>0$, $\alpha>1$ and

$$\frac{d}{ds}y(s) \ge m\,y^{\alpha}(s) , \qquad y(0)=y_0 .$$

Then $y(s)$ becomes infinite for some $s$ satisfying

$$0 < s \le \frac{1}{(\alpha-1)\cdot m\cdot y_0^{\alpha-1}} .$$


Proof: The differential inequality implies $y$ is positive and strictly increasing, so it is bounded
below by the solution of

$$\frac{d}{ds}z(s) = m\,z^{\alpha}(s) , \qquad z(0)=y_0$$

which, as easily verified, is

$$z(s) = \frac{y_0}{\left(1-(\alpha-1)\,m\,y_0^{\alpha-1}s\right)^{1/(\alpha-1)}} .$$

Since $z(s)$ has a pole at $1/((\alpha-1)\,m\,y_0^{\alpha-1})$, $y(s)$ must have a pole at or before that point. Q.E.D.
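
As an illustrative check (not part of the original argument), the closed-form comparison solution of Lemma 1 can be evaluated numerically. The sketch below assumes real scalar data; the function names `blowup_distance`, `comparison_solution` and `ode_residual` are hypothetical helpers introduced only for this example.

```python
def blowup_distance(m, y0, alpha):
    """Bound 1/((alpha-1)*m*y0**(alpha-1)) on where y must blow up (Lemma 1)."""
    return 1.0 / ((alpha - 1.0) * m * y0 ** (alpha - 1.0))


def comparison_solution(m, y0, alpha, s):
    """Closed-form solution z(s) of z' = m*z**alpha with z(0) = y0."""
    base = 1.0 - (alpha - 1.0) * m * y0 ** (alpha - 1.0) * s
    if base <= 0.0:
        raise ValueError("s is at or beyond the pole of z")
    return y0 / base ** (1.0 / (alpha - 1.0))


def ode_residual(m, y0, alpha, s, h=1e-6):
    """Centered finite-difference estimate of z'(s) - m*z(s)**alpha."""
    dz = (comparison_solution(m, y0, alpha, s + h)
          - comparison_solution(m, y0, alpha, s - h)) / (2.0 * h)
    return dz - m * comparison_solution(m, y0, alpha, s) ** alpha


if __name__ == "__main__":
    m, y0, alpha = 1.0, 4.0, 2.0  # e.g. an initial condition number of 4
    s_star = blowup_distance(m, y0, alpha)
    print("pole of z at s =", s_star)
    print("z just before the pole:", comparison_solution(m, y0, alpha, 0.999 * s_star))
    print("ODE residual at s = 0.1:", ode_residual(m, y0, alpha, 0.1))
```

With $\alpha = 2$ and $m = 1$ the pole sits at $1/y_0$, which is the reciprocal-of-condition-number distance that reappears in the matrix inversion section.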

The next differential inequality will yield lower bounds on the distance to the nearest
ill-posed problem in terms of the condition number.

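Lemma 2: Suppose $M>0$, $y(s)>0$ for all $s$, $y_0>0$, $\beta>1$ and

$$\frac{d}{ds}y(s) \le M\,y^{\beta}(s) , \qquad y(0)=y_0 .$$

Then $y(s)$ is finite for

$$0 \le s < \frac{1}{(\beta-1)\cdot M\cdot y_0^{\beta-1}} .$$

Proof: Since $y(s)$ is positive, it is bounded above by the solution of

$$\frac{d}{ds}z(s) = M\,z^{\beta}(s) , \qquad z(0)=y_0$$

which, as in the last lemma, is

$$z(s) = \frac{y_0}{\left(1-(\beta-1)\,M\,y_0^{\beta-1}s\right)^{1/(\beta-1)}} .$$

Since $z(s)$ is finite for all $s$ less than $1/((\beta-1)\,M\,y_0^{\beta-1})$, so is $y(s)$. Q.E.D.

3.  Matrix Inversion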

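In this section $\|\cdot\|$ will denote an arbitrary vector norm, $\|\cdot\|_D$ the dual norm:

$$\|y^T\|_D = \sup_{x \ne 0} \frac{|y^T x|}{\|x\|}$$

and $\|A\|$ the induced matrix norm:

$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|} .$$

Let $S$ be some set of matrices. Let $\mathrm{dist}(A,S)$ denote the minimum distance from the matrix $A$
to the set $S$: $\mathrm{dist}(A,S) = \inf_{B \in S} \|A-B\|$.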

As discussed in the introduction, $\|A^{-1}\|$ is a condition number for the problem of
inverting the matrix $A$. This is true because to first order

$$(A + \delta A)^{-1} = A^{-1} - A^{-1}\delta A A^{-1} + O(\|\delta A\|^2) \equiv A^{-1} + X$$

so that for small $\delta A$

$$\frac{\|X\|}{\|A^{-1}\|} \le \|A^{-1}\| \cdot \|\delta A\| .$$

The following result is originally due to [Eckart, Young] when $\|\cdot\|$ is the Euclidean norm and
to Gastinel for an arbitrary norm (Gastinel's proof appeared in [Kahan66]):

Theorem 1: (Gastinel) Let $S$ be the set of singular matrices. Then

$$\mathrm{dist}(A,S) = \|A^{-1}\|^{-1} ,$$

i.e. the reciprocal of the condition number $\|A^{-1}\|$ of the problem of inverting $A$.

Proof: If $\|\delta A\| < \|A^{-1}\|^{-1}$ then $A + \delta A$ is invertible since $(A + \delta A)^{-1} = (I + A^{-1}\delta A)^{-1}A^{-1}$ and
$\|A^{-1}\delta A\| \le \|A^{-1}\|\,\|\delta A\| < 1$. Therefore $\mathrm{dist}(A,S) \ge \|A^{-1}\|^{-1}$. To show equality holds choose
$x$ and $y$ such that $\|x\| = \|y^T\|_D = 1$ and $y^T A^{-1} x = \|A^{-1}\|$ (the existence of $x$ and $y$ follows from
the definitions of the norms). Let $\delta A = -\|A^{-1}\|^{-1} x y^T$. Clearly $\|\delta A\| = \|A^{-1}\|^{-1}$. To see
$A + \delta A$ is singular note that $(A + \delta A)(A^{-1}x) = 0$. Q.E.D.
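
The rank-one construction in this proof can be tried on a small example. The sketch below is only an illustration under stated assumptions: it fixes the vector infinity-norm (whose dual is the 1-norm, and whose induced matrix norm is the largest absolute row sum), restricts to real 2 by 2 matrices, and uses hypothetical helper names that do not come from the paper.

```python
def inverse_2x2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def det_2x2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def norm_inf(a):
    """Matrix norm induced by the vector infinity-norm (largest absolute row sum)."""
    return max(sum(abs(v) for v in row) for row in a)


def gastinel_perturbation(a):
    """Rank-one dA = -x y^T / ||A^{-1}|| for which a + dA is singular."""
    b = inverse_2x2(a)
    i = max(range(2), key=lambda k: sum(abs(v) for v in b[k]))  # row attaining the norm
    x = [1.0 if v >= 0 else -1.0 for v in b[i]]      # infinity-norm of x is 1
    y = [1.0 if k == i else 0.0 for k in range(2)]   # dual (1-)norm of y is 1
    c = 1.0 / norm_inf(b)                            # so that y^T A^{-1} x = 1/c
    return [[-c * x[r] * y[col] for col in range(2)] for r in range(2)]


if __name__ == "__main__":
    a = [[2.0, 1.0], [1.0, 3.0]]
    da = gastinel_perturbation(a)
    shifted = [[a[r][c] + da[r][c] for c in range(2)] for r in range(2)]
    print("||dA|| =", norm_inf(da), " 1/||A^-1|| =", 1.0 / norm_inf(inverse_2x2(a)))
    print("det(A + dA) =", det_2x2(shifted))
```

For other norms the vectors $x$ and $y$ would be chosen differently (for the 2-norm they come from the singular vectors), so this sketch covers only the infinity-norm case.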

We now prove this theorem using Lemmas 1 and 2. This alternate proof is no simpler
than the above one, but illustrates the techniques we use later.

Theorem 2: Let $S$ be as in Theorem 1. Then

$$\mathrm{dist}(A,S) = \|A^{-1}\|^{-1} .$$

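Proof: To show $\mathrm{dist}(A,S) \ge \|A^{-1}\|^{-1}$ let $A(s)$ be any smooth path from $A(0)=A$ to $A(s_0)\in S$
parameterized by arclength (i.e. $\|\dot A(s)\| = 1$). Then

$$\frac{d}{ds}\|A^{-1}(s)\| = \lim_{h\to0}\frac{\|A^{-1}(s+h)\| - \|A^{-1}(s)\|}{h}
= \lim_{h\to0}\frac{\|A^{-1}(s) - hA^{-1}(s)\dot A(s)A^{-1}(s)\| - \|A^{-1}(s)\|}{h}$$

$$\le \lim_{h\to0}\frac{\|hA^{-1}(s)\dot A(s)A^{-1}(s)\|}{|h|} \le \|A^{-1}(s)\|^2 .$$

Applying Lemma 2 with $M=1$ and $\beta=2$ implies $\|A^{-1}(s)\|$ remains finite for $s < \|A^{-1}\|^{-1}$.
Since the path $A(s)$ from $A$ to $S$ was arbitrary, we have $\mathrm{dist}(A,S) \ge \|A^{-1}\|^{-1}$.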

To prove the opposite inequality we need to choose a path $A(s)$ along which $\|A^{-1}(s)\|$
increases as quickly as possible, i.e. we need an integrable vector field $X(A)$, $\|X(A)\|=1$,
where $\|A^{-1}X(A)A^{-1}\| = \|A^{-1}\|^2$. Let $x(A)$ and $y(A)$ be defined as in Theorem 1:
$\|x(A)\| = \|y^T(A)\|_D = 1$ and $y^T(A)A^{-1}x(A) = \|A^{-1}\|$. Now let $X(A) = x(A)\,y^T(A)$. Assume
for the moment that $X(A)$ is integrable, and let $A(s)$ be an integral curve parameterized by
arclength such that $A(0)=A$ and $\|A^{-1}(s)\|$ is increasing. Then it is easy to see that

$$\frac{d}{ds}\|A^{-1}(s)\| = \|A^{-1}(s)\|^2$$

so by Lemma 1 (with $m=1$ and $\alpha=2$) $\|A^{-1}(s)\|$ becomes infinite for $s = \|A^{-1}\|^{-1}$ as desired.

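We show $X(A)$ is integrable by integrating it explicitly. Its integral curves are straight
lines, as they must be since they are geodesics in Euclidean space. To prove this it suffices to
show that if $\|x\| = \|y^T\|_D = 1$ and $y^TA^{-1}x = \|A^{-1}\|$, then $y^T(A-\epsilon xy^T)^{-1}x = \|(A-\epsilon xy^T)^{-1}\|$ for
$\epsilon$ sufficiently small. This follows from the Sherman-Morrison formula [Golub, Van Loan]

$$(A-\epsilon xy^T)^{-1} = A^{-1} + \frac{\epsilon A^{-1}xy^TA^{-1}}{1 - \epsilon y^TA^{-1}x}$$

so

$$\|(A-\epsilon xy^T)^{-1}\| \le \|A^{-1}\| + \frac{\epsilon\|A^{-1}\|^2}{1-\epsilon\|A^{-1}\|} = \frac{\|A^{-1}\|}{1-\epsilon\|A^{-1}\|}$$

and

$$y^T(A-\epsilon xy^T)^{-1}x = \|A^{-1}\| + \frac{\epsilon\|A^{-1}\|^2}{1-\epsilon\|A^{-1}\|} = \frac{\|A^{-1}\|}{1-\epsilon\|A^{-1}\|}$$

so

$$y^T(A-\epsilon xy^T)^{-1}x = \|(A-\epsilon xy^T)^{-1}\| = \frac{\|A^{-1}\|}{1-\epsilon\|A^{-1}\|} .$$

Q.E.D.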
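
The growth of $\|A^{-1}\|$ along the straight-line integral curve can be checked numerically. This is a minimal sketch under assumptions not in the original: the infinity-norm, a particular real 2 by 2 matrix, and vectors $x$, $y$ picked by hand so that $y^TA^{-1}x = \|A^{-1}\|_\infty$. The helper names are hypothetical.

```python
def inverse_2x2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def norm_inf(a):
    """Largest absolute row sum (norm induced by the vector infinity-norm)."""
    return max(sum(abs(v) for v in row) for row in a)


def norm_on_line(a, x, y, eps):
    """Infinity-norm of (a - eps * x y^T)^{-1}."""
    moved = [[a[i][j] - eps * x[i] * y[j] for j in range(2)] for i in range(2)]
    return norm_inf(inverse_2x2(moved))


def predicted_norm(a, eps):
    """Closed form ||A^{-1}|| / (1 - eps * ||A^{-1}||) derived in the text."""
    kappa = norm_inf(inverse_2x2(a))
    return kappa / (1.0 - eps * kappa)


if __name__ == "__main__":
    a = [[2.0, 1.0], [1.0, 3.0]]
    x, y = [1.0, -1.0], [1.0, 0.0]  # here y^T A^{-1} x = 0.8 = ||A^{-1}||_inf
    for eps in (0.0, 0.5, 1.0, 1.2):
        print(eps, norm_on_line(a, x, y, eps), predicted_norm(a, eps))
```

For this matrix the two columns of output agree and both grow without bound as `eps` approaches 1.25, the reciprocal of $\|A^{-1}\|_\infty$.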

In this proof, we explicitly integrated the vector field $X(A)$ in order to show integral
curves existed. This is not generally possible or desirable, and in our later examples we only
show that $X(A)$ is continuous, which is sufficient for integrability.

4.  Eigenvalue and Eigenvector Computations

In this section we consider the problem of computing a simple eigenvalue or
corresponding eigenvector of a general matrix $A$. In this section we let $\|\cdot\|$ denote the
2-norm $\|x\| = \left(\sum_{i=1}^{n}|x_i|^2\right)^{1/2}$ and

$$\|A\| = \sup_{x\ne0}\frac{\|Ax\|}{\|x\|} .$$

Let $\lambda$ be a simple eigenvalue of $A$, $x$ its right eigenvector and $y^T$ its left eigenvector, where
we normalize so that $y^Tx = 1$. The projector $P$ belonging to $\lambda$ is defined as $xy^T$ and has norm
$\|P\| = \|x\|\,\|y^T\|$. It is well known [Wilkinson65] that if we perturb $A$ by $\delta A$, $\lambda$ can change at
most by $|\delta\lambda| \le \|P\|\,\|\delta A\|$, and that this bound is attainable. Therefore, we call $\|P\|$ the
condition number of the eigenvalue $\lambda$. It is also known ([Wilkinson72], [Kahan72]) that the
distance from $A$ to a matrix which has a double eigenvalue (at $\lambda$) is bounded by
$\|A\|/(\|P\|^2-1)^{1/2}$, or approximately $\|A\|/\|P\|$ for large $\|P\|$. Thus the reciprocal of the
condition number $\|P\|$ bounds the relative distance to the nearest infinitely ill-conditioned
eigenproblem. An $n$-tuple eigenvalue $\lambda$ is infinitely ill-conditioned because a perturbation of
size $\epsilon$ in the matrix can change $\lambda$ by $\epsilon^{1/n}$, whose derivative at $\epsilon = 0$ is infinite. Therefore we
may take the set of matrices with multiple eigenvalues as our surface of ill-posed problems.
Similar considerations show that the same surface is the set of ill-conditioned problems when
computing eigenvectors.
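
The $\epsilon^{1/n}$ behaviour is easy to see for $n = 2$. The following sketch is an illustration only, assuming a 2 by 2 Jordan block perturbed in its lower-left entry; the helper names are hypothetical.

```python
import cmath


def eigenvalues_2x2(a):
    """Both eigenvalues of a 2x2 matrix via the quadratic formula."""
    (p, q), (r, s) = a
    mean = (p + s) / 2.0
    disc = cmath.sqrt(((p - s) / 2.0) ** 2 + q * r)
    return mean + disc, mean - disc


def eigenvalue_shift(lam, eps):
    """Largest eigenvalue movement when [[lam, 1], [0, lam]] becomes [[lam, 1], [eps, lam]]."""
    perturbed = [[lam, 1.0], [eps, lam]]
    return max(abs(mu - lam) for mu in eigenvalues_2x2(perturbed))


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-8):
        shift = eigenvalue_shift(2.0, eps)
        # The ratio shift/eps grows like eps**(-1/2): no finite condition number exists.
        print(eps, shift, shift / eps)
```

The perturbed eigenvalues are $\lambda \pm \sqrt{\epsilon}$, so the ratio of eigenvalue change to perturbation size is unbounded as $\epsilon \to 0$.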

In this section we will prove two similar results using Lemma 1, one yielding a
perturbation $\delta A$ of the same norm and structure as in Wilkinson's proof, and the other
yielding a perturbation with a different structure and approximately the same norm.
Afterwards, we will obtain a lower bound on the distance to the nearest matrix with multiple
eigenvalues, a special case of which is within a small factor of the best current bound in the
literature.

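First we need some notation. Since our matrix norm is invariant under orthogonal
transformations, we may assume without loss of generality that our matrix $A$ is in Schur
canonical form [Golub, Van Loan]:

$$A = \begin{bmatrix} \lambda & x^T \\ 0 & B \end{bmatrix} .$$

Let $r = (B-\lambda)^{-T}x$. It is easy to show that in this coordinate system the right and left
eigenvectors of $\lambda$ may be written $x = [1,0,\ldots,0]^T$ and $y^T = [1,-r^T]$, so that
$(\|P\|^2-1)^{1/2} = \|r\|$. It is to this last quantity we shall apply Lemma 1. If we perturb $A$ to
$A + \delta A$, with

$$\delta A = \begin{bmatrix} \delta A_{11} & \delta A_{12} \\ \delta A_{21} & \delta A_{22} \end{bmatrix}$$

partitioned conformally with $A$, then to first order in $\delta A$, $P$ is perturbed to [Kato]

$$P + \delta P = P + S\,\delta A\,P + P\,\delta A\,S \qquad (1)$$

where $S$ is the reduced resolvent, or

$$S = \lim_{z\to\lambda}(I-P)(A-z)^{-1} = \begin{bmatrix} 0 & r^T(B-\lambda)^{-1} \\ 0 & (B-\lambda)^{-1} \end{bmatrix}$$

in this coordinate system. Expanding (1) yields

$$P + \delta P = \begin{bmatrix} 1 & -r^T \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} r^T(B-\lambda)^{-1}\delta A_{21} & W \\ (B-\lambda)^{-1}\delta A_{21} & -(B-\lambda)^{-1}\delta A_{21}r^T \end{bmatrix} \qquad (2)$$

where

$$W = \delta A_{11}r^T(B-\lambda)^{-1} + \delta A_{12}(B-\lambda)^{-1} - r^T\delta A_{22}(B-\lambda)^{-1} - r^T(B-\lambda)^{-1}\delta A_{21}r^T - r^T\delta A_{21}r^T(B-\lambda)^{-1} .$$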

We consider two perturbations, one where only $\delta A_{22}$ is nonzero (this is the perturbation
used in [Wilkinson72]) and one where only $\delta A_{21}$ is nonzero. It is easy to find the smallest
perturbation $\delta A_{22}$ that makes $A + \delta A$ have a double eigenvalue at $\lambda$:

Theorem 3: [Wilkinson72] If $\|P\|>1$ then there exists a $\delta A$ with only $\delta A_{22}\ne0$ such that
$A + \delta A$ has a double eigenvalue at $\lambda$ and

$$\|\delta A\| \le \frac{\|A\|}{(\|P\|^2-1)^{1/2}} .$$

Proof: The $\delta A_{22}$ we seek is the smallest perturbation such that $B-\lambda+\delta A_{22}$ is singular, which
has norm $\|\delta A_{22}\| = \|(B-\lambda)^{-1}\|^{-1}$. But since $r = (B-\lambda)^{-T}x$,

$$\|r\| \le \|x\|\,\|(B-\lambda)^{-1}\| \le \|A\|\,\|(B-\lambda)^{-1}\|$$

or

$$\|\delta A_{22}\| = \|(B-\lambda)^{-1}\|^{-1} \le \frac{\|A\|}{\|r\|} = \frac{\|A\|}{(\|P\|^2-1)^{1/2}}$$

as desired. Q.E.D.

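Now we prove this result using Lemma 1:

Theorem 4: If $\|P\|>1$ then there exists a $\delta A$ with only $\delta A_{22}\ne0$ such that $A + \delta A$ has a double
eigenvalue at $\lambda$ and

$$\|\delta A\| \le \frac{\|A\|}{(\|P\|^2-1)^{1/2}} .$$

Proof: To apply Lemma 1 when only $\delta A_{22}\ne0$ we need to compute

$$(\|P+\delta P\|^2-1)^{1/2} - (\|P\|^2-1)^{1/2} = \left(\left\|\begin{bmatrix} 1 & -r^T - r^T\delta A_{22}(B-\lambda)^{-1} \\ 0 & 0 \end{bmatrix}\right\|^2 - 1\right)^{1/2} - \left(\left\|\begin{bmatrix} 1 & -r^T \\ 0 & 0 \end{bmatrix}\right\|^2 - 1\right)^{1/2}$$

when $\|\delta A\|$ approaches zero. Letting $r_n = r/\|r\|$, it is easy to see this last expression is

$$\mathrm{Re}\,\|r\|\,r_n^T\,\delta A_{22}(B-\lambda)^{-1}\bar r_n$$

to first order in $\delta A_{22}$. Now since $r = (B-\lambda)^{-T}x$,

$$\|r\|^2 = x^T(B-\lambda)^{-1}(B-\lambda)^{-H}\bar x = x^T(B-\lambda)^{-1}\bar r \le \|x\|\,\|(B-\lambda)^{-1}\bar r\|$$

so

$$\|(B-\lambda)^{-1}\bar r_n\| \ge \frac{\|r\|}{\|x\|} \ge \frac{\|r\|}{\|A\|} .$$

Therefore by choosing $\delta A_{22}$ to be a small multiple of

$$\bar r_n\,r_n^T(B-\lambda)^{-H}$$

we get

$$\mathrm{Re}\,\|r\|\,r_n^T\,\delta A_{22}(B-\lambda)^{-1}\bar r_n \ge \frac{\|\delta A_{22}\|\,\|r\|^2}{\|A\|} . \qquad (3)$$

This implies that we may choose $\delta A$ so that the rate of change of $(\|P\|^2-1)^{1/2} = \|r\|$ is at
least $\|r\|^2/\|A\|$. The vector field given by (3) above is clearly smooth and integrable, so we
may let $y(s) \equiv \|r(s)\|$ where $r(s)$ is computed along an integral curve of the vector field.
Thus

$$\frac{d}{ds}y(s) \ge \frac{y^2(s)}{\|A\|}$$

and we may apply Lemma 1 to $y(s)$ with $m = \|A\|^{-1}$ and $\alpha=2$ to get the desired upper bound

$$\frac{\|A\|}{y(0)} = \frac{\|A\|}{(\|P\|^2-1)^{1/2}}$$

on the distance to the nearest matrix with $\lambda$ as a double eigenvalue. Q.E.D.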

We may prove a similar theorem when we only perturb $A_{21}$. Intuitively we would
expect such a perturbation to be at least as effective as one in the $\delta A_{22}$ position since it can
move both the eigenvalues of $B$ and $\lambda$. This result has not appeared in the literature before.

Theorem 5: If $\|P\|>1$ then there exists a $\delta A$ such that $A + \delta A$ has a double eigenvalue at $\lambda$
and

$$\|\delta A\| \le \frac{2\|A\|_F}{(\|P\|^2-1)^{1/2}} .$$

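Proof: Setting all but $\delta A_{21}$ to 0 in (2) yields

$$P + \delta P = \begin{bmatrix} 1 & -r^T \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} r^T(B-\lambda)^{-1}\delta A_{21} & -r^T(B-\lambda)^{-1}\delta A_{21}r^T - r^T\delta A_{21}r^T(B-\lambda)^{-1} \\ (B-\lambda)^{-1}\delta A_{21} & -(B-\lambda)^{-1}\delta A_{21}r^T \end{bmatrix} .$$

When $\delta A$ is small the square of the norm of this perturbed projector is at least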


3 


1  +  2Rtr^iB-\)-'bA,,  +  ||r|p  +  2|lr||Re(r^(B-X)-^6A2ir'-r.  -  r^tA^f^iB-Xy'FJ 

=  1  +  ||r|p  +  2Rcr'-(B-X)-H(lklP  +  1)  +  IklP^^ISA:, 
where  r^  =  r/\\r\\.   Now 

||2.^(«-x,-.l(IHF  .  1)  .  IHIV-nil -  |^-,|||||(|Hp!"i!.|HIV.'1-'ll  ^  NU 

where  \\A\\f  is  the  Frobenius  norm  (the  reason  for  this  choice  instead  of  the  smaller  ||A||  will 
be  dear  in  a  moment).  Thus  by  choosing  Mji  a  small  multiple  of 
Pr'-(B-X)-l[(||r|p  +  1)  +  IHlVni*  weg« 

Since all quantities defining $\delta A_{21}$ are analytic [Kato] the vector field defined by $\delta A_{21}$ is
integrable. Furthermore, it is orthogonal to $A$ (in the $\mathrm{tr}\,A^HB$ inner product) for all $A$.
Therefore its integral curves lie on spheres of constant $\|A\|_F$. Thus, along an integral curve
the function $y(s) = (\|P\|^2-1)^{1/2} = \|r(s)\|$ satisfies

$$\frac{d}{ds}y(s) \ge \frac{y^2(s)}{2\|A\|_F}$$

so we may apply Lemma 1 with $m = 1/(2\|A\|_F)$ and $\alpha = 2$. Note that $\|A\|_F$ is constant along
the curve. Lemma 1 implies that there is a perturbation $\delta A_{21}$ of 2-norm at most
$2\|A\|_F/(\|P\|^2-1)^{1/2}$ that makes the eigenvalue $\lambda$ coalesce with another eigenvalue of $A + \delta A$.
Q.E.D.

We turn now to computing eigenvectors. Let $|||\cdot|||$ denote an arbitrary operator norm.
From (1) we can see that if $A$ is perturbed by $\delta A$, $P$ can be perturbed to first order by at
most $|||\delta P||| \le 2\,|||S|||\cdot|||P|||\cdot|||\delta A|||$. A close examination of (2) shows this bound can be
nearly attained, so $|||S|||\cdot|||P|||$ may be used as a condition number for $P$. The next theorem
will use Lemma 2 to show that $1/(7\,|||S|||\cdot|||P|||)$ is a lower bound on the distance to the surface
of ill-posed problems. This result is essentially identical to other results in the literature
([Stewart], [Demmel84]).

Theorem 6: The distance in the $|||\cdot|||$ norm from $A$ to the nearest matrix where $\lambda$ merges into
a multiple eigenvalue is at least

$$\frac{1}{7\cdot|||S|||\cdot|||P|||} .$$

Proof: We need to compute the gradient of $|||S|||\cdot|||P|||$. Since we are only interested in an
upper bound, it will suffice to use the first order bound

$$|||S+\delta S|||\cdot|||P+\delta P||| - |||S|||\cdot|||P||| \le |||S|||\cdot|||\delta P||| + |||\delta S|||\cdot|||P||| .$$

Following [Kato] we may compute to first order

$$S + \delta S = (I-P-\delta P)\left(\lambda+\delta\lambda - (I-P-\delta P)(A+\delta A)\right)^{-1}(I-P-\delta P)$$

$$= S - P\,\delta A\,SS - SS\,\delta A\,P - 2(\mathrm{tr}\,P\,\delta A)\,SS + S\,\delta A\,S$$

so that

$$|||\delta S||| \le 5\,|||P|||\cdot|||S|||^2\cdot|||\delta A||| .$$

Here we use the fact that $P$ is of rank 1 to bound $|\mathrm{tr}\,\delta A\,P| \le |||\delta A|||\cdot|||P|||$. Similarly from (1)

$$|||\delta P||| \le 2\,|||S|||\cdot|||P|||\cdot|||\delta A||| .$$

Therefore

$$|||S+\delta S|||\cdot|||P+\delta P||| - |||S|||\cdot|||P||| \le 7\,(|||S|||\cdot|||P|||)^2\cdot|||\delta A||| .$$

Now let $y(s) = |||S(s)|||\cdot|||P(s)|||$ where $A(s)$ is any smooth curve parameterized by arclength
from $A(0)=A$ to $A(s_0)$ where $A(s_0)$ has $\lambda$ merged into a double eigenvalue. Then

$$\frac{d}{ds}y(s) \le 7\,y^2(s)$$

so we may apply Lemma 2 with $M = 7$ and $\beta = 2$ to get that $s_0$, the distance in the $|||\cdot|||$ norm to the
matrix with $\lambda$ merged into a double eigenvalue, satisfies

$$s_0 \ge \frac{1}{7\cdot|||S|||\cdot|||P|||} .$$

Q.E.D.

By choosing a specific $|||\cdot|||$ we will see that this result contains the best current lower
bound in the literature as a special case ([Stewart], [Demmel84]). Let

$$R = \begin{bmatrix} 1 & -r^T \\ 0 & \|P\|\,I \end{bmatrix} .$$

It is easy to see that $RAR^{-1} = \mathrm{diag}(\lambda,B)$, and in fact the condition number

$$\kappa(R) = \|R\|\cdot\|R^{-1}\| = \|P\| + (\|P\|^2-1)^{1/2}$$

of $R$ is the minimum over all matrices which block diagonalize $A$ [Demmel83]. Now choose

$$|||X||| \equiv \|RXR^{-1}\| .$$

Then it is easy to see that the lower bound of Theorem 6 becomes

$$\frac{1}{7\cdot|||S|||\cdot|||P|||} = \frac{1}{7}\left\|\begin{bmatrix} 0 & 0 \\ 0 & (B-\lambda)^{-1} \end{bmatrix}\right\|^{-1}\left\|\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\right\|^{-1} = \frac{\sigma_{\min}(B-\lambda)}{7} .$$

Since $\|X\| \ge |||X|||/\kappa(R)$, we see that a lower bound in the $\|\cdot\|$ norm on the distance from $A$
to the nearest matrix where $\lambda$ merges into a double eigenvalue is

$$\frac{\sigma_{\min}(B-\lambda)}{7\,(\|P\| + (\|P\|^2-1)^{1/2})} ,$$

which is within a constant factor of the lower bound $\sigma_{\min}(B-\lambda)/(4\|P\|)$ in the literature.

5.  Polynomial Zero Finding

Let $p(z) = \sum_{i=0}^{n} p_i z^i$ be a complex polynomial with a simple zero at $x$: $p(x) = 0$. Let $\|p\|$
denote the Euclidean norm of its vector of coefficients. If $p$ is perturbed by adding a
sufficiently small polynomial $e(z) = \sum_{i=0}^{n} e_i z^i$, then to first order $p+e$ will have a simple zero
at $x+\delta x = x - e(x)/p'(x)$. This follows from simply solving the Taylor expansion

$$0 = (p+e)(x+\delta x) = p(x) + \delta x\,p'(x) + e(x) + O(\|e\|^2) = \delta x\,p'(x) + e(x) + O(\|e\|^2)$$

for $\delta x$. Thus, we may use $|1/p'(x)|$ as a condition number for the zero $x$. In this section we
will find relationships between the reciprocal of the condition number $|p'(x)|$ and the distance
from $p$ (measured using $\|\cdot\|$) to the nearest polynomial where $x$ merges into a multiple zero.
An $n$-tuple zero is infinitely ill-conditioned because a perturbation of $p$ of norm $\epsilon$ can
cause a perturbation of $x$ of size $\epsilon^{1/n}$ which has an infinite derivative at $\epsilon = 0$. Therefore we
may take the set of polynomials with multiple zeros as our set of ill-posed problems.
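
The first-order formula $x + \delta x = x - e(x)/p'(x)$ can be compared with an exactly computed zero. The sketch below is illustrative only: it assumes real coefficients stored in ascending order, and the function names are hypothetical.

```python
def horner(coeffs, z):
    """Evaluate sum(coeffs[i] * z**i) with coefficients in ascending order."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * z + c
    return value


def derivative(coeffs):
    """Coefficients of the derivative polynomial, ascending order."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def zero_condition_number(p, x):
    """Condition number |1/p'(x)| of the simple zero x of p."""
    return 1.0 / abs(horner(derivative(p), x))


def first_order_zero(p, e, x):
    """Predicted zero x - e(x)/p'(x) of p + e near the simple zero x of p."""
    return x - horner(e, x) / horner(derivative(p), x)


if __name__ == "__main__":
    p = [2.0, -3.0, 1.0]   # (z - 1)(z - 2), simple zero at x = 1
    e = [1e-3, 0.0, 0.0]   # small constant perturbation
    predicted = first_order_zero(p, e, 1.0)
    exact = (3.0 - (9.0 - 4.0 * 2.001) ** 0.5) / 2.0  # smaller root of z^2 - 3z + 2.001
    print("condition number:", zero_condition_number(p, 1.0))
    print("predicted:", predicted, " exact:", exact)
```

The predicted and exact zeros differ by roughly the square of the perturbation size, as the $O(\|e\|^2)$ term suggests.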

The most general previous result relating $|p'(x)|$ to the distance to the nearest
polynomial where $x$ becomes double is due to [Hough]. We need some notation: if $p$ is a
polynomial $p(z) = \sum_{i=0}^{n} p_i z^i$ of degree at most $n$, let $\underline{p} = [p_0,\ldots,p_n]^T$ denote the vector of
its coefficients. For any complex number $z$, let $\underline{z} = [1,z,z^2,\ldots,z^n]^T$. Therefore, $p(z) = \underline{p}^T\underline{z}$.
Also, let $\underline{z}' = [0,1,2z,3z^2,\ldots,nz^{n-1}]^T$; thus $p'(z) = \underline{p}^T\underline{z}'$. Finally, let
$\underline{z}'' = [0,0,2,6z,\ldots,n(n-1)z^{n-2}]^T$; thus $p''(z) = \underline{p}^T\underline{z}''$.

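Theorem 7: [Hough] Suppose $p$ is a polynomial of degree at least 2 and $p(x) = 0$. Then the
smallest polynomial $e$ of degree no greater than $p$ such that $p + e$ has a double zero at $x$ has
norm

$$\|e\| = \frac{|p'(x)|}{\left\|\underline{x}' - \dfrac{\underline{x}^H\underline{x}'}{\|\underline{x}\|^2}\,\underline{x}\right\|} \le \sqrt{5}\,|p'(x)|\cdot\min(1,|x|^{2-n}) .$$

Sketch of Proof: This is an underdetermined linear least squares problem

$$\begin{bmatrix} 1 & x & x^2 & \cdots & x^i & \cdots & x^n \\ 0 & 1 & 2x & \cdots & ix^{i-1} & \cdots & nx^{n-1} \end{bmatrix}\begin{bmatrix} e_0 \\ e_1 \\ \vdots \\ e_n \end{bmatrix} = \begin{bmatrix} 0 \\ -p'(x) \end{bmatrix}$$

which can be solved explicitly for a solution of minimum norm, giving the claimed solution.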
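
For real data the minimum-norm solution of this two-row system can be written down directly: it is the multiple of the component of $\underline{x}'$ orthogonal to $\underline{x}$ that satisfies $e'(x) = -p'(x)$. The sketch below is an illustration restricted to real coefficients and a real zero (the complex case needs conjugates); the function names are hypothetical.

```python
import math


def power_vectors(x, n):
    """Return ([1, x, ..., x**n], [0, 1, 2x, ..., n*x**(n-1)]) for real x."""
    xv = [x ** i for i in range(n + 1)]
    dv = [0.0] + [i * x ** (i - 1) for i in range(1, n + 1)]
    return xv, dv


def smallest_double_zero_perturbation(p, x):
    """Minimum 2-norm e with e(x) = 0 and e'(x) = -p'(x), real data only."""
    n = len(p) - 1
    xv, dv = power_vectors(x, n)
    dp = sum(c * d for c, d in zip(p, dv))                  # p'(x)
    proj = sum(a * b for a, b in zip(xv, dv)) / sum(a * a for a in xv)
    w = [d - proj * a for a, d in zip(xv, dv)]              # part of x' orthogonal to x
    scale = -dp / sum(v * v for v in w)
    return [scale * v for v in w]


if __name__ == "__main__":
    p = [2.0, -3.0, 1.0]                                    # (z - 1)(z - 2), ascending order
    e = smallest_double_zero_perturbation(p, 1.0)
    q = [a + b for a, b in zip(p, e)]
    print("e =", e, " ||e|| =", math.sqrt(sum(v * v for v in e)))
    print("p + e =", q)                                     # 1.5 * (z - 1)**2
```

In this example $\|e\| = 1/\sqrt{2}$, which equals $|p'(1)|$ divided by the norm of the orthogonal component $[-1, 0, 1]$.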

Our approach proceeds by computing the gradient of $|1/p'(x)|$ under changes in $p$,
subject to $p(x) = 0$. We compute this gradient in the next lemma.

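Lemma 3: Let $p$ be a polynomial of degree at least 1. Let $D_e$ denote the directional
derivative of $1/|p'(x)|$ ($x$ a zero of $p$) in the direction of the polynomial $e$, where $\|e\|=1$.
Then

$$D_e = \frac{1}{|p'(x)|}\,\mathrm{Re}\;\underline{e}^T\left[\frac{p''(x)\,\underline{x}}{(p'(x))^2} - \frac{\underline{x}'}{p'(x)}\right] .$$

Furthermore,

$$|D_e| \le \Delta \equiv \frac{\|p''(x)\,\underline{x} - p'(x)\,\underline{x}'\|}{|p'(x)|^3} .$$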

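Proof: The directional derivative of $1/|p'(x)|$ in the direction of a unit vector $e$ is

$$D_e = \lim_{\epsilon\to0}\frac{1}{\epsilon}\left[\frac{1}{|(p+\epsilon e)'(x+\delta x)|} - \frac{1}{|p'(x)|}\right]$$

where $x+\delta x$ is a zero of $p + \epsilon e$. From our earlier formula for $\delta x$ we have

$$D_e = \lim_{\epsilon\to0}\frac{1}{\epsilon}\left[\frac{1}{\left|p'(x) - \dfrac{\epsilon e(x)p''(x)}{p'(x)} + \epsilon e'(x) + O(\epsilon^2)\right|} - \frac{1}{|p'(x)|}\right]$$

$$= \lim_{\epsilon\to0}\frac{1}{\epsilon|p'(x)|}\left[\frac{1}{\left|1 - \dfrac{\epsilon e(x)p''(x)}{(p'(x))^2} + \dfrac{\epsilon e'(x)}{p'(x)} + O(\epsilon^2)\right|} - 1\right] .$$

Noting that if $\eta$ is a small complex number,

$$\frac{1}{|1+\eta|} = 1 - \mathrm{Re}\,\eta + O(|\eta|^2)$$

we see that

$$D_e = \frac{1}{|p'(x)|}\,\mathrm{Re}\left[\frac{e(x)p''(x)}{(p'(x))^2} - \frac{e'(x)}{p'(x)}\right] = \frac{1}{|p'(x)|}\,\mathrm{Re}\;\underline{e}^T\left[\frac{p''(x)\,\underline{x}}{(p'(x))^2} - \frac{\underline{x}'}{p'(x)}\right]$$

as claimed. We can clearly pick a unit vector $e$ to make $D_e$ equal its upper bound

$$\Delta = \frac{\|p''(x)\,\underline{x} - p'(x)\,\underline{x}'\|}{|p'(x)|^3} .$$

Q.E.D.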

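Applying this lemma for $e$ perpendicular to $\underline{x}$ (i.e. $e(x) = 0$) yields the same result as in
Theorem 7:

Theorem 8: Suppose $p$ is a polynomial of degree at least 2 and $p(x)=0$. Then the smallest
polynomial $e$ of degree no greater than $p$ such that $p + e$ has a double zero at $x$ has norm

$$\|e\| = \frac{|p'(x)|}{\left\|\underline{x}' - \dfrac{\underline{x}^H\underline{x}'}{\|\underline{x}\|^2}\,\underline{x}\right\|} .$$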

Proof: By choosing $e$ such that $\underline{e}^T\underline{x}=0$ in Lemma 3 we get a vector field in the space of
polynomials along whose integral curves the zero $x$ of $p$ will not move. The unit vector $e$
which satisfies $\underline{e}^T\underline{x}=0$ and maximizes $D_e$ is clearly the one in the direction of the
component of $\underline{x}'$ orthogonal to $\underline{x}$, that is, in the direction of

$$\underline{x}' - \frac{\underline{x}^H\underline{x}'}{\|\underline{x}\|^2}\,\underline{x} , \qquad \text{whose norm we denote } n(x) \equiv \left\|\underline{x}' - \frac{\underline{x}^H\underline{x}'}{\|\underline{x}\|^2}\,\underline{x}\right\| .$$

This vector field is clearly continuous, so let $p_s$ be an integral curve parameterized by
arclength. Let $y(s) = |1/p_s'(x)|$. Then from Lemma 3 we have

$$\frac{d}{ds}y(s) = n(x)\,y^2(s)$$

so by Lemmas 1 and 2 $y(s)$ has a pole at $s_0 = |p'(x)|/n(x)$, i.e. $p_{s_0}$ has a double zero at $x$. To
see that no closer polynomial to $p$ has this property, note that under the constraint that
$\underline{e}^T\underline{x}=0$, $|D_e| \le n(x)/|p'(x)|^2$, so by Lemma 2 $|p'(x)|/n(x)$ is the minimum distance to a
polynomial with a double root at $x$ as well. Q.E.D.

Without  much  more  effort,  we  get  a  similar  theorem  with  a  different  constraint  on  e: 

Theorem 9: Let $p$ be a polynomial of degree at least 3 and $p(x) = 0$. Then there is a quadratic
polynomial $e$ of norm at most $2|p'(x)|$ such that $p + e$ has a double zero. This double zero
corresponds to $x$ in that as $\epsilon$ increases from 0 to 1, the polynomial $p + \epsilon e$ has a simple zero
which moves from $x$ when $\epsilon = 0$ until it merges with another zero to form a double zero at
$\epsilon = 1$.

Proof: The first three components of

$$\frac{p''(x)}{p'(x)}\,\underline{x} - \underline{x}'$$

are

$$[\,p''(x)/p'(x),\; x\,p''(x)/p'(x) - 1,\; x^2p''(x)/p'(x) - 2x\,] \equiv [\,y,\; xy-1,\; x^2y-2x\,] . \qquad (4)$$

For any $x$ or $y$, one of the components of this last vector has to have absolute value at least
1/2. To show this, assume to the contrary that all three components are smaller than 1/2. Then
$|xy-1|<1/2$ implies $|xy|>1/2$ or $|x|>1/(2|y|)>1$. Thus $|x^2y-2x| = |x|\cdot|xy-2| > 1-|xy-1| > 1/2$, a
contradiction. Therefore by choosing $e$ a unit vector pointing in the same direction as the
vector in (4), we get

$$D_e \ge \frac{1}{2\,|p'(x)|^2} .$$

The vector field defined by this choice of $e$ is smooth since the components in (4) are
smooth, so let $p_s$ be an integral curve with zero $x_s$, where $s$ is arclength. Define
$y(s) = |1/p_s'(x_s)|$. Thus

$$\frac{d}{ds}y(s) \ge \frac{y^2(s)}{2}$$

so by Lemma 1 there is a polynomial $p_{s_0}$ with a double zero with $\|p-p_{s_0}\| \le s_0 \le 2|p'(x)|$.

Q.E.D.

Just as we used Lemma 1 to derive an upper bound on the distance to the nearest
polynomial with a multiple zero, we will use Lemma 2 to derive a lower bound.

Theorem 10: Let $p$ be a polynomial of degree $n\ge2$. Let $p_s$ be a continuous map from
$s\in[0,s_0]$ to the space of polynomials of degree no greater than $n$, such that $p_0=p$, and such
that $s$ is the arclength parameter. Let $x_s$ be a zero of $p_s$ such that $x_s$ is a continuous function
of $s$ and $x_0=x$. Then if $x_{s_0}$ is a multiple zero of $p_{s_0}$,

^l£l£ 


Jlp-Pji. 


-   „(n+l)-   maxdlp.lU^JIk.P"-^) 


Proof:  From  Lemma  3  we  know  that  if  i|«I|=l  then 

ID  I  ^  A  =  iig-'-'W -Jf'P'Wl 
IP'MP 

=  lliph"  -  j^'pVII 
b'WP 

-  M'-'p  -  ^'3."p\\ 

Ip'WP 
Ip'WP 

where  |||!  is  the  2-norm  of  a  matrix.    The  matrix  ii"^  -  i'x'^  is  an  n  +  1  by  n  +  1  matrix 
whose  norm  we  may  bound  simply  by 

ilil"^  -  i'i'^l  ^  lUII  •  lli"^ll  +  Ill'IP  ^n'max  (LlxP-^)   . 
This  implies 

IT  I-   "^axfLUF-^llbll 
Now  consider  the  function  y{s)  —  y\p,'{x,)\.   We  have  just  shown 
j-y(s)^n^max(l,\x,^''-')\\pM-y\s)
so  that  by  Lemma  2  y(s)  remains  finite  for 


'^''       2n'-   max  {\\pM,\\pM  kP""^) 

as  claimed.  Q.E.D. 

At first glance it would seem hard to apply Theorem 10 since $s_0$ is defined in terms of
itself. In practice, however, one would apply the theorem when it is possible to make $x$ a
multiple zero by only a small perturbation in $p$. Thus, $x_s$ should not vary much from $x$ nor
should $\|p_s\|$ vary much from $\|p\|$. In such cases an approximate lower bound is thus provided
by the expression

Ip'MP 
2n^  ■  |lp||-max(l,!i|2''--''-) 

6.  Pole  Assignment 

The pole assignment problem is defined as follows: given an $n$ by $n$ matrix $A$, an $n$ by $m$
matrix $B$, and a set $\{\lambda_i\}$ of $n$ complex numbers, find an $m$ by $n$ feedback matrix $F$ such that
$A+BF$ has eigenvalues $\{\lambda_i\}$. The motivation for this problem is the following: given a control
system $\dot x = Ax+Bu$, choose the control input $u$ as a function $Fx$ of $x$ (feedback) to make the
matrix of the controlled system $\dot x = (A+BF)x$ have a specified spectrum. It is well known that
this problem has a solution for arbitrary $\{\lambda_i\}$ if and only if the pair $(A,B)$ is controllable, i.e.
$[B\,|\,AB\,|\,A^2B\,|\cdots|\,A^{n-1}B]$ has full rank $n$ [Wonham]. If the pair $(A,B)$ is not controllable, then
some eigenvalues (called the uncontrollable modes) of $A+BF$ will be independent of $F$ (and
be eigenvalues of $A$); the remaining eigenvalues can be set arbitrarily by choosing $F$.
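
The rank criterion for controllability can be sketched in a few lines. This is only an illustration for tiny dense examples: the rank is computed by Gaussian elimination with a fixed tolerance (an assumption of this sketch, and numerically fragile for pairs that are close to uncontrollable), and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(mat, tol=1e-10):
    """Rank by Gaussian elimination with partial pivoting (tiny examples only)."""
    rows = [[float(v) for v in row] for row in mat]
    rk = 0
    for col in range(len(rows[0])):
        if rk == len(rows):
            break
        pivot = max(range(rk, len(rows)), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) <= tol:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            rows[r] = [u - f * v for u, v in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def is_controllable(a, b):
    """True if [B | AB | ... | A^(n-1) B] has full rank n."""
    n = len(a)
    blocks, block = [], b
    for _ in range(n):
        blocks.append(block)
        block = matmul(a, block)
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb) == n


if __name__ == "__main__":
    a = [[0.0, 1.0], [0.0, 0.0]]
    print(is_controllable(a, [[0.0], [1.0]]))  # input drives the second state
    print(is_controllable(a, [[1.0], [0.0]]))  # second state cannot be reached
```

A yes/no rank test says nothing about how close a controllable pair is to an uncontrollable one, which is the question the condition-number viewpoint of this section addresses.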

The robust pole assignment problem, as defined in [Kautsky, Nichols, Van Dooren], is to
find $F$ subject to the additional condition that $X$, the eigenvector matrix of $A+BF = X\Lambda X^{-1}$,
be as well conditioned as possible (here $\Lambda = \mathrm{diag}(\lambda_i)$). The condition number
$\kappa(X) \equiv \|X\|\,\|X^{-1}\|$ (in this section $\|\cdot\|$ denotes the 2-norm) of the best conditioned $X$ turns
out to measure the sensitivity of both $F$ and the time dependent solution of the
control system $\dot x = (A+BF)x$. For example, if the $\lambda_i$ are distinct, then $\|F\|$ will get larger as
$\kappa(X)$ gets larger (see [Kautsky, Nichols, Van Dooren] for details). Therefore, we shall take
$\kappa(X)$ as our condition number for the robust pole assignment problem.

We will also use a slightly different measure of distance than used before:

$$ \mathrm{dist}((A,B),(\hat{A},\hat{B})) \equiv \frac{\|A-\hat{A}\|}{\|A\|} + \frac{\|B-\hat{B}\|}{\|B\|} . $$

This distance has the advantage of being insensitive to the scaling of A and B. Despite the
asymmetry in its definition, it clearly defines the length of a smooth curve in (A,B) space
unambiguously.
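The two properties just mentioned (scale insensitivity and asymmetry) can be seen numerically. The sketch below assumes the distance is the sum of the relative 2-norm changes in A and in B, and restricts itself to real matrices with two rows so that the 2-norm has a closed form; the function names and the example data are hypothetical.

```python
import math


def norm2(M):
    """Spectral (2-)norm of a real matrix with exactly two rows."""
    row0, row1 = M
    p = sum(x * x for x in row0)
    r = sum(x * x for x in row1)
    q = sum(x * y for x, y in zip(row0, row1))
    # Largest eigenvalue of the 2x2 Gram matrix M M^T = [[p, q], [q, r]].
    return math.sqrt(0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q)))


def sub(M, N):
    return [[x - y for x, y in zip(rm, rn)] for rm, rn in zip(M, N)]


def scale(c, M):
    return [[c * x for x in row] for row in M]


def dist(A, B, A_hat, B_hat):
    """Relative distance from (A, B) to (A_hat, B_hat), normalized by (A, B)."""
    return norm2(sub(A, A_hat)) / norm2(A) + norm2(sub(B, B_hat)) / norm2(B)


# Illustrative data (assumptions, not taken from the paper).
A, B = [[0.0, 1.0], [-2.0, -3.0]], [[0.0], [1.0]]
A_hat, B_hat = [[0.0, 1.0], [-2.0, -2.0]], [[0.0], [2.0]]

d_forward = dist(A, B, A_hat, B_hat)
d_backward = dist(A_hat, B_hat, A, B)          # differs: normalization changes
d_scaled = dist(scale(10.0, A), scale(0.1, B),
                scale(10.0, A_hat), scale(0.1, B_hat))  # same as d_forward
```

Rescaling A and B by independent factors leaves the value unchanged, while swapping the roles of the two pairs does not, because the normalizing norms change.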

As in previous sections, we are interested in relating $\kappa(X)$ to the distance from (A,B) to
the nearest ill-posed pair (for which no X exists). How do we characterize the set S of
problems (A,B) (for fixed $\{\lambda_i\}$) which are ill-posed? From the above discussion, it is clear
that it includes all uncontrollable (A,B) where $\{\lambda_i\}$ does not include the uncontrollable modes
(e.g. if no $\lambda_i$ is an eigenvalue of A).

Using Lemma 2, we will prove a theorem which gives a lower bound on the distance to
the set S of ill-posed problems in terms of the reciprocal of the condition number $\kappa(X)$.
Before doing so we state the following lemma from [Kautsky, Nichols, Van Dooren]:

Lemma 4: [Kautsky, Nichols, Van Dooren] Let

$$ B = [U_0, U_1] \begin{bmatrix} Z \\ 0 \end{bmatrix} $$

where $U = [U_0, U_1]$ is orthogonal and Z is of full rank. Let $\mathcal{X}_i = N(U_1^T (A - \lambda_i I))$, where $N(\cdot)$
denotes the null space. Then the i-th column $x_i$ of X (the right eigenvector of A+BF for the
eigenvalue $\lambda_i$) satisfies $x_i \in \mathcal{X}_i$.

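Proof: Premultiply the equation $A+BF = X \Lambda X^{-1}$ by $U^T$ and rearrange to get

$$ \begin{bmatrix} Z \\ 0 \end{bmatrix} F = -U^T (AX - X\Lambda) X^{-1} $$

or, taking the last n - rank(B) rows,

$$ 0 = U_1^T (AX - X\Lambda) . $$

The i-th column of this equation reads $U_1^T (A - \lambda_i I) x_i = 0$. Q.E.D.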

Let $X_i$ be a matrix of orthonormal columns spanning $\mathcal{X}_i$, and let $S = [X_1 | \cdots | X_n]$. If B
is of full rank and $\lambda_i$ is not an eigenvalue of A, $X_i$ will be n by m. Lemma 4 says that the
eigenvector matrix X can be written as

$$ X = S \begin{bmatrix} u_1 & & 0 \\ & \ddots & \\ 0 & & u_n \end{bmatrix} \tag{5} $$

where $u_i$ is a column vector of the same dimension as $\mathcal{X}_i$. Since it is hard to characterize the
condition number of X, we instead use the following lower bound based on S:

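Lemma 5: Let X and S be defined as above. Then

$$ \kappa(X) \ge \sigma_{\min}^{-1}(S) = \|(SS^*)^{-1}\|^{1/2} = \Big\| \Big( \sum_{i=1}^{n} X_i X_i^* \Big)^{-1} \Big\|^{1/2} . $$

Proof: Assume without loss of generality that $\|X\| = 1$. This clearly implies that $\|u_i\| \le 1$ in (5).
Now let $\sigma$ be the smallest singular value of S and $v^T$ the corresponding left singular vector,
i.e. $\|v^T\| = 1$ and $\|v^T S\| = \sigma$. Then it follows simply that $\|v^T X\| \le \sigma$. Thus

$$ \kappa(X) \ge \sigma_{\min}^{-1}(S) = \|(SS^*)^{-1}\|^{1/2} = \Big\| \Big( \sum_{i=1}^{n} X_i X_i^* \Big)^{-1} \Big\|^{1/2} $$

as claimed. Q.E.D.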
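Lemmas 4 and 5 can be checked by hand on a two-dimensional, single-input example. The sketch below is only an illustration under stated assumptions: A is taken in companion form, B is the second unit vector (so $U_1$ is the first unit vector), the desired poles are real, and singular values of 2 by 2 matrices are computed in closed form. None of these choices or names come from the paper.

```python
import math


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def singular_values_2x2(M):
    """Return (sigma_max, sigma_min) of a real 2x2 matrix in closed form."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d            # squared Frobenius norm
    det = abs(a * d - b * c)
    s_max = math.sqrt(0.5 * (t + math.sqrt(max(t * t - 4.0 * det * det, 0.0))))
    return s_max, det / s_max                     # sigma_max * sigma_min = |det|


# Hypothetical data: companion-form A, B = e_2, desired poles lam.
A = [[0.0, 1.0], [-2.0, -3.0]]
lam = [-5.0, -6.0]

# Feedback row F chosen so that A + B F is the companion matrix of
# (z - lam[0]) (z - lam[1]).
F = [-lam[0] * lam[1] - A[1][0], lam[0] + lam[1] - A[1][1]]
closed_loop = [A[0], [A[1][0] + F[0], A[1][1] + F[1]]]

# Lemma 4: with U_1 = e_1, the null space of U_1^T (A - lam_i I), i.e. of the
# first row (-lam_i, 1), is spanned by (1, lam_i)^T.
eigvecs = [[1.0, l] for l in lam]
residuals = [max(abs(y - l * x) for y, x in zip(matvec(closed_loop, v), v))
             for v, l in zip(eigvecs, lam)]

# X has the eigenvectors as columns; S has the same columns normalized.
X = [[eigvecs[0][0], eigvecs[1][0]], [eigvecs[0][1], eigvecs[1][1]]]
col_norms = [math.hypot(*v) for v in eigvecs]
S = [[X[r][c] / col_norms[c] for c in range(2)] for r in range(2)]

x_max, x_min = singular_values_2x2(X)
kappa_X = x_max / x_min
inv_sigma_min_S = 1.0 / singular_values_2x2(S)[1]   # Lemma 5 lower bound
```

Because m = 1 here, each $X_i$ is a single unit vector and X differs from S only by the column scalings $u_i$, so the inequality of Lemma 5 compares the condition number of an unscaled eigenvector matrix with the reciprocal of the smallest singular value of its column-normalized version.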

It is to $\sigma_{\min}^{-1}(S)$ that we will apply Lemma 2 to get a lower bound on the distance to the
nearest ill-posed problem.

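Lemma 6: Let $\sigma_{\min}^{-1}(S(A,B))$ be the value of $\sigma_{\min}^{-1}(S)$ for the S defined by A and B. Let S' be
the set of problems (A,B) for which $\sigma_{\min}(S(A,B)) = 0$. Let dist((A,B),S') be defined as

$$ \mathrm{dist}((A,B),S') \equiv \inf_{(\hat{A},\hat{B}) \in S'} \mathrm{dist}((A,B),(\hat{A},\hat{B})) . $$

Assume B is of full rank and that none of the $\lambda_i$ is an eigenvalue of A so that

$$ \kappa_B \equiv \|B\| \cdot \sigma_{\min}^{-1}(B) $$

and

$$ \kappa_A \equiv \max_i \big( \|A-\lambda_i\| \cdot \|(A-\lambda_i)^{-1}\| \, , \; \|A\| \cdot \|(A-\lambda_i)^{-1}\| \big) $$

are well defined. Then

$$ \mathrm{dist}((A,B),S') \ge \frac{.296}{\sqrt{n}\, \kappa_A \kappa_B\, \sigma_{\min}^{-1}(S)} . $$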

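Proof: Consider perturbations $A + \delta A$ of A and $B + \delta B$ of B. We need to estimate for small $\delta A$
and $\delta B$

$$ \sigma_{\min}^{-2}(S) - \sigma_{\min}^{-2}(S_\delta) = \|(SS^*)^{-1}\| - \|(S_\delta S_\delta^*)^{-1}\| \equiv \delta \|(SS^*)^{-1}\| $$

where $S = S(A,B)$ and $S_\delta = S(A+\delta A, B+\delta B)$. Let $X_i$ denote the matrices of orthonormal
columns comprising $S = [X_1 | \cdots | X_n]$ and let $\sigma$ be the smallest singular value of S. If the
perturbations $\delta A$ and $\delta B$ yield perturbations $\delta X_i$ in $X_i$, then S becomes
$[X_1 + \delta X_1 | \cdots | X_n + \delta X_n]$ and $SS^*$ becomes (to first order)

$$ SS^* + \sum_i (\delta X_i X_i^* + X_i \delta X_i^*) = SS^* + [\delta X_1 | \cdots | \delta X_n] S^* + S [\delta X_1 | \cdots | \delta X_n]^* $$

and $(SS^*)^{-1}$ becomes (again to first order)

$$ (SS^*)^{-1} - (SS^*)^{-1} \big( [\delta X_1 | \cdots | \delta X_n] S^* + S [\delta X_1 | \cdots | \delta X_n]^* \big) (SS^*)^{-1} . $$

What we need to estimate, then, is

$$ \big\| (SS^*)^{-1} \big( [\delta X_1 | \cdots | \delta X_n] S^* + S [\delta X_1 | \cdots | \delta X_n]^* \big) (SS^*)^{-1} \big\| . $$

Now it is easy to see that $\|S^*(SS^*)^{-1}\| = \sigma^{-1}$ so

$$ \big\| (SS^*)^{-1} \big( [\delta X_1 | \cdots | \delta X_n] S^* + S [\delta X_1 | \cdots | \delta X_n]^* \big) (SS^*)^{-1} \big\| \le 2 \sigma^{-3} \cdot \| [\delta X_1 | \cdots | \delta X_n] \| \le 2 \sqrt{n}\, \sigma^{-3} \max_i \|\delta X_i\| . \tag{6} $$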


To estimate $\|\delta X_i\|$ note that $\mathcal{X}_i = N(U_1^T (A-\lambda_i)) = R((A-\lambda_i)^T U_1)^\perp$, where $R(\cdot)$ denotes the
column space of $(\cdot)$. Now let Y be an n by n-m matrix of orthonormal columns and $Y^\perp$ an n
by m matrix of orthonormal columns orthogonal to the columns of Y. Then if Y is perturbed
to $Y + \delta Y$ ($\delta Y$ in an arbitrary direction), it is easy to verify that $Y^\perp$ is perturbed (to first
order) to $Y^\perp - Y\, \delta Y^T Y^\perp$.
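The verification left to the reader is short. A sketch, assuming only that $Y^T Y = I$ and $Y^T Y^\perp = 0$, and discarding terms of second order in $\delta Y$:

```latex
\begin{align*}
(Y + \delta Y)^T \left( Y^\perp - Y\,\delta Y^T Y^\perp \right)
  &= Y^T Y^\perp - Y^T Y\,\delta Y^T Y^\perp + \delta Y^T Y^\perp
     + O(\|\delta Y\|^2) \\
  &= 0 - \delta Y^T Y^\perp + \delta Y^T Y^\perp + O(\|\delta Y\|^2) \\
  &= O(\|\delta Y\|^2) .
\end{align*}
```

So the perturbed columns remain orthogonal to the perturbed Y to first order. They also remain orthonormal to first order, since the cross terms in $(Y^\perp - Y\,\delta Y^T Y^\perp)^T (Y^\perp - Y\,\delta Y^T Y^\perp)$ all contain the factor $Y^T Y^\perp = 0$.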

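In our case we want Y to span $R((A-\lambda_i)^T U_1)$ so we may take

$$ Y = (A-\lambda_i)^T U_1 \big( U_1^T (A-\lambda_i)(A-\lambda_i)^T U_1 \big)^{-1/2} . $$

$X_i$ can be taken as $Y^\perp$ and so

$$ \delta Y = \big( \delta A^T U_1 + (A-\lambda_i)^T \delta U_1 \big) \big( U_1^T (A-\lambda_i)(A-\lambda_i)^T U_1 \big)^{-1/2} + (A-\lambda_i)^T U_1 D $$

where $\|D\|$ is on the order of $\|\delta A\|$ and $\|\delta B\|$ and where the result of perturbing B to $B + \delta B$
is to perturb $U_1$ to $U_1 + \delta U_1$. The term involving D drops out of $\delta Y^T X_i$ because
$U_1^T (A-\lambda_i) X_i = 0$. Thus

$$ \delta X_i = -Y \Big[ \big( \delta A^T U_1 + (A-\lambda_i)^T \delta U_1 \big) \big( U_1^T (A-\lambda_i)(A-\lambda_i)^T U_1 \big)^{-1/2} \Big]^T X_i $$

so

$$ \|\delta X_i\| \le \big( \|\delta A\| + \|A-\lambda_i\| \cdot \|\delta U_1\| \big) \cdot \big\| \big( U_1^T (A-\lambda_i)(A-\lambda_i)^T U_1 \big)^{-1/2} \big\| \le \big( \|\delta A\| + \|A-\lambda_i\| \cdot \|\delta U_1\| \big) \cdot \|(A-\lambda_i)^{-1}\| . $$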

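We estimate $\|\delta U_1\|$ similarly. Let $Y = B(B^*B)^{-1/2}$ be a set of orthonormal columns
spanning the range of B. We may take $U_0$ as Y (and hence $U_1$ as $Y^\perp$). Perturbing B to $B + \delta B$ makes an
equivalent change of $\delta Y = \delta B (B^*B)^{-1/2} + BD$ in Y, where $\|D\|$ is on the order of $\|\delta B\|$, so

$$ \|\delta U_1\| \le \|\delta B\| \cdot \|(B^*B)^{-1/2}\| = \|\delta B\| \cdot \sigma_{\min}^{-1}(B) . $$

Putting these estimates together yields

$$ \|\delta X_i\| \le \|\delta A\| \cdot \|(A-\lambda_i)^{-1}\| + \|\delta B\| \cdot \sigma_{\min}^{-1}(B) \cdot \kappa(A-\lambda_i) $$

and substituting into (6) we get the following bound on the change in $\|(SS^*)^{-1}\|$:

$$ \delta \|(SS^*)^{-1}\| \le 2 \sqrt{n}\, \sigma^{-3} \cdot \max_i \big( \|\delta A\| \cdot \|(A-\lambda_i)^{-1}\| + \|\delta B\| \cdot \sigma_{\min}^{-1}(B) \cdot \kappa(A-\lambda_i) \big) \le 2 \sqrt{n}\, \sigma^{-3} \Big( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta B\|}{\|B\|} \Big) \kappa_A \kappa_B . \tag{7} $$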

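We are now prepared to apply Lemma 2. Let (A(s),B(s)) be any smooth curve from
(A(0),B(0)) = (A,B) to $(A(s_0),B(s_0)) \in S'$ parameterized by arclength in the sense that

$$ \frac{\|\frac{d}{ds} A(s)\|}{\|A(s)\|} + \frac{\|\frac{d}{ds} B(s)\|}{\|B(s)\|} = 1 . $$

Let $y(s) = \sigma_{\min}^{-2}(S(A(s),B(s)))$. Then from (7) we see

$$ \frac{d}{ds} y(s) \le 2 \sqrt{n}\, \kappa_{A(s)} \kappa_{B(s)}\, y^{3/2}(s) $$

so by applying Lemma 2 with p = 3/2 the distance $s_0$ to the set S' is at least

$$ \frac{1}{\sqrt{n}\, \max_s (\kappa_{A(s)} \kappa_{B(s)})\, y^{1/2}(0)} = \frac{1}{\sqrt{n}\, \max_s (\kappa_{A(s)} \kappa_{B(s)})\, \sigma^{-1}} . $$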
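Lemma 2 is stated earlier in the paper and is not repeated here. As a hedged sketch of why the exponent p = 3/2 produces this bound, write $M = 2\sqrt{n}\, \max_s (\kappa_{A(s)} \kappa_{B(s)})$ (a symbol introduced only for this illustration) so that $y' \le M y^{3/2}$, and integrate directly:

```latex
\begin{align*}
\frac{d}{ds}\, y^{-1/2}(s) &= -\tfrac12\, y^{-3/2}(s)\, y'(s) \;\ge\; -\tfrac12 M , \\
y^{-1/2}(s_0) &\ge y^{-1/2}(0) - \tfrac12 M s_0 .
\end{align*}
```

At the ill-posed end of the curve $y(s_0) = \infty$, so the left side of the second line is zero and $s_0 \ge 2 / (M y^{1/2}(0)) = 1 / (\sqrt{n}\, \max_s (\kappa_{A(s)} \kappa_{B(s)})\, y^{1/2}(0))$, which is the expression above.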

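It remains to estimate the maximum of $\kappa_{A(s)} \kappa_{B(s)}$ in the denominator.
Note that if

$$ \frac{\|A(s)-A\|}{\|A\|} + \frac{\|B(s)-B\|}{\|B\|} \le \eta \tag{8} $$

then

$$ \|\delta A\| \equiv \|A(s)-A\| \le \eta \|A\| \le \eta\, \kappa_A \|(A-\lambda_i)^{-1}\|^{-1} $$

and

$$ \|\delta B\| \equiv \|B(s)-B\| \le \eta \|B\| = \eta\, \kappa_B\, \sigma_{\min}(B) . $$

Therefore

$$ \kappa_{A(s)} = \max_i \big( \|A(s)\| \cdot \|(A(s)-\lambda_i)^{-1}\| \, , \; \|A(s)-\lambda_i\| \cdot \|(A(s)-\lambda_i)^{-1}\| \big) $$

$$ \le \max_i \Big( \frac{\|A-\lambda_i\| \cdot \|(A-\lambda_i)^{-1}\|}{1 - \|(A-\lambda_i)^{-1}\| \cdot \|\delta A\|} + \frac{\|\delta A\| \cdot \|(A-\lambda_i)^{-1}\|}{1 - \|(A-\lambda_i)^{-1}\| \cdot \|\delta A\|} \Big) \le \kappa_A \frac{1 + \eta \kappa_A}{1 - \eta \kappa_A} . $$

The middle expression is written for the term containing $\|A(s)-\lambda_i\|$; the term containing $\|A(s)\|$ is
bounded in the same way with $\|A\|$ in place of $\|A-\lambda_i\|$.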

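Similarly

$$ \kappa_{B(s)} = \frac{\|B(s)\|}{\sigma_{\min}(B(s))} \le \frac{\|B\| + \|\delta B\|}{\sigma_{\min}(B) - \|\delta B\|} \le \kappa_B \frac{1 + \eta \kappa_B}{1 - \eta \kappa_B} . $$

Thus (8) implies

$$ \kappa_{A(s)} \kappa_{B(s)} \le \kappa_A \kappa_B \frac{(1 + \eta \max(\kappa_A,\kappa_B))^2}{(1 - \eta \max(\kappa_A,\kappa_B))^2} . $$

Thus, dist((A,B),S') can be less than $\eta$ only if

$$ \frac{(1 - \eta \max(\kappa_A,\kappa_B))^2}{\sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1}\, (1 + \eta \max(\kappa_A,\kappa_B))^2} \le \eta $$

or, rearranging

$$ (\sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1} \eta)(1 + \max(\kappa_A,\kappa_B)\, \eta)^2 \ge (1 - \max(\kappa_A,\kappa_B)\, \eta)^2 . \tag{9} $$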

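Since $\sigma \le \|S\| \le \sqrt{n}$, $\kappa_A \ge 1$ and $\kappa_B \ge 1$,

$$ \sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1} \ge \max(\kappa_A,\kappa_B) . $$

Thus (9) is true only if

$$ (\sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1} \eta)(1 + \sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1} \eta)^2 \ge (1 - \sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1} \eta)^2 . \tag{10} $$

Letting $x = \sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1} \eta$, we see (10) is equivalent to $x(1+x)^2 \ge (1-x)^2$, or
$x^3 + x^2 + 3x - 1 \ge 0$. The cubic is increasing in x, so this is true only if x is at least its
unique real root, which is approximately .296. Thus

$$ \mathrm{dist}((A,B),S') \ge \frac{.296}{\sqrt{n}\, \kappa_A \kappa_B\, \sigma^{-1}} $$

as was to be proved. Q.E.D.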
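The numerical constant in the bound is the real root of the cubic in x. A minimal sketch that locates it by bisection (the function names are illustrative; any root finder would do):

```python
def cubic(x):
    """Left side of x^3 + x^2 + 3x - 1 >= 0."""
    return x ** 3 + x ** 2 + 3.0 * x - 1.0


def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection for a root of f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    assert f(lo) < 0.0 < f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# The derivative 3x^2 + 2x + 3 is positive everywhere, so the root is unique.
x0 = bisect_root(cubic, 0.0, 1.0)
```

The root lies between .295 and .296 (about .2956), which agrees to three digits with the constant .296 quoted in the text.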

Combining  this  with  Lemma  5  yields 

Theorem 11: Let S be the set of (A,B) where no nonsingular matrix X of eigenvectors exists.
Let B be of full rank, no $\lambda_i$ be an eigenvalue of A, and $\kappa_A$ and $\kappa_B$ be defined as in Lemma 6.
Then

$$ \mathrm{dist}((A,B),S) \ge \frac{.296}{\sqrt{n}\, \kappa_A \kappa_B\, \kappa(X)} . $$

Let U be the set of uncontrollable pairs (A,B). Then we also have

$$ \mathrm{dist}((A,B),U) \ge \frac{.296}{\sqrt{n}\, \kappa_A \kappa_B\, \kappa(X)} . $$

Proof: From Lemma 5 we have $\kappa(X) \ge \sigma_{\min}^{-1}(S)$ for the matrix S of (5), so the first claim follows
immediately from Lemma 6. The second claim follows since the set of uncontrollable problems U is
contained in the set S. Q.E.D.
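To get a feeling for the sizes involved, the bound of Theorem 11 can be evaluated on a small example. The sketch below makes several assumptions that are not in the paper: A is 2 by 2 and real, B is a single column (so $\kappa_B = 1$), the desired poles are real, the constant is taken as .296, and $\kappa(X)$ is computed for one particular eigenvector matrix rather than the best conditioned one. All function names are hypothetical.

```python
import math


def singular_values_2x2(M):
    """Return (sigma_max, sigma_min) of a real 2x2 matrix in closed form."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d
    det = abs(a * d - b * c)
    s_max = math.sqrt(0.5 * (t + math.sqrt(max(t * t - 4.0 * det * det, 0.0))))
    return s_max, det / s_max


def shift(A, lam):
    """A - lam * I for a 2x2 matrix and a real scalar lam."""
    return [[A[0][0] - lam, A[0][1]], [A[1][0], A[1][1] - lam]]


def kappa_A(A, lams):
    """max_i max(||A - lam_i||, ||A||) * ||(A - lam_i)^{-1}|| in the 2-norm."""
    norm_A = singular_values_2x2(A)[0]
    worst = 0.0
    for lam in lams:
        s_max, s_min = singular_values_2x2(shift(A, lam))
        inv_norm = 1.0 / s_min                 # ||(A - lam I)^{-1}||
        worst = max(worst, s_max * inv_norm, norm_A * inv_norm)
    return worst


def distance_lower_bound(n, k_A, k_B, k_X, c=0.296):
    """Lower bound c / (sqrt(n) k_A k_B k_X) on the distance of Theorem 11."""
    return c / (math.sqrt(n) * k_A * k_B * k_X)


# Hypothetical data: companion-form A, B = e_2, desired real poles lams.
A = [[0.0, 1.0], [-2.0, -3.0]]
lams = [-5.0, -6.0]
k_A = kappa_A(A, lams)
k_B = 1.0                                       # single column: ||B|| / sigma_min(B) = 1
X = [[1.0, 1.0], [lams[0], lams[1]]]            # columns (1, lam_i)^T
x_max, x_min = singular_values_2x2(X)
k_X = x_max / x_min
bound = distance_lower_bound(2, k_A, k_B, k_X)
```

Since any valid X satisfies $\kappa(X) \ge \sigma_{\min}^{-1}(S)$, using a particular X in place of the best conditioned one only makes the computed lower bound smaller, so it remains a valid (if weaker) bound under the assumptions above.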

We can also write the second inequality of Theorem 11 as

$$ \kappa(X) \ge \frac{.296}{\sqrt{n}\, \kappa_A \kappa_B\, \mathrm{dist}((A,B),U)} $$

implying that the closer (A,B) is to being uncontrollable, the larger the condition number
$\kappa(X)$ of the problem. Note that the factors $\kappa_A$ and $\kappa_B$, both at least 1, tend to make the lower
bound on $\kappa(X)$ smaller. The reason for this is as follows. If $\kappa_B$ is large, a very small
perturbation of B can change the space R(B) spanned by its columns greatly, in particular in
such a way that the pole assignment problem becomes quite easy. Therefore we cannot
guarantee that $\kappa(X)$ will be bad in this case. Similarly, if $\kappa_A$ is large, some $\lambda_i$ is nearly an
eigenvalue of A. Thus, even if (A,B) is nearly uncontrollable, only a small perturbation may
be needed to put a pole at $\lambda_i$. In the extreme case when $\{\lambda_i\}$ is the spectrum of A, F=0 solves
the pole assignment problem even if (A,B) is exactly uncontrollable and so $\kappa(X)$ depends
only on how hard it is to diagonalize A. A similar result to Theorem 11 was proven in
[Demmel85] using more ad hoc techniques.